Content Filtering Support Overview
Content Filtering support is an enhanced, or extended, premium service. The System Administration Guide provides basic system configuration information, and the product administration guides provide procedures to configure basic functionality of core network services. Before using the procedures in this chapter, it is recommended that you select the configuration example that best meets your service model and configure the required elements for that model, as described in the respective product Administration Guide.
Content Filtering is an in-line service available for 3GPP and 3GPP2 networks to filter HTTP and WAP requests from mobile subscribers based on the URLs in the requests. This enables operators to filter and control the content that an individual subscriber can access, so that subscribers are not inadvertently exposed to universally unacceptable content or to content that is inappropriate as per the subscribers’ preferences.
In the URL Blacklisting solution, all HTTP/WAP URLs in subscriber requests are matched against a database of “blacklisted” URLs. If there is a match, the flow is discarded, redirected, or terminated as configured. If there is no match, subscribers view the content as they would normally.
URL Blacklisting may or may not be a subscriber opt-in service; operators can enable URL Blacklisting either for all subscribers or for a subset of subscribers. A typical case is applying a blacklist database of child pornography URLs to all subscribers so that they are not inadvertently exposed to such universally unacceptable content.
In Category-based Static Content Filtering, all HTTP/WAP URLs in subscriber requests are matched against a static URL categorization database. Action is taken based on a URL’s category, and the action configured for that category in the subscriber’s content filtering policy. Possible actions include permitting, blocking, redirecting, and inserting/altering content.
In Category-based Static-and-Dynamic Content Filtering, if static rating categorizes a URL as either “dynamic” or “unknown”, the requested content is sent for dynamic rating, in which the content is analyzed and categorized. Action is taken based on the category determined by dynamic rating, and the action configured for that category in the subscriber’s content filtering policy. Possible actions include permitting, blocking, redirecting, and inserting/altering content.
Typically, Category-based Content Filtering is an opt-in service: subscribers choose a content-filtering policy or plan, such as Teen, Child, or Adult, and are subjected to content filtering as per their chosen plan. The content-filtering policies of different subscribers may also differ, giving them differential access to content. This solution provides maximum flexibility and is also referred to as Policy-based Content Filtering.
[600-00-7801] Blacklisting Integrated Service
[600-00-7586] Integrated Content Filtering Service, 1k Sessions
In the URL Blacklisting solution, a blacklist is a list of known URLs/URIs, which for some reason are being denied recognition. The blacklist can be obtained from a known source such as the National Center for Missing & Exploited Children (NCMEC, http://www.missingkids.com), or any other IP source. The blacklist is a clear-text file; it must be named cumulative.csv and must use the same format as the blacklist file from NCMEC. For more information on the blacklist file, please contact your local service representative.
Unlike the Category-based Content Filtering solution, which categorizes URLs as per a static database and takes different actions based on the different policies associated with subscribers, URL Blacklisting is applicable to all subscribers associated with a blacklisting-enabled rulebase. The same blacklist database is used for all subscribers, and for a specific URL, the same action is taken for all subscribers.
The blacklist file is downloaded and converted into a non human-readable optimized format (OPTBLDB) and then made available in the system. Once in place, all HTTP and WAP requests from subscribers are inspected in order to determine the requested destination URL/URI. If the URL/URI is not present in the blacklist then the request is passed on as usual. If the URL/URI is present in the blacklist, the request is dropped, or the flow is redirected or terminated as configured. There is no indication/messaging sent to the requesting subscribers that the requested HTTP/WAP URL/URI was rejected due to a blacklist match.
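The matching step described above can be sketched as follows. This is only an illustrative Python model, not the system's implementation: the in-memory set stands in for the optimized OPTBLDB format, and the names `BlacklistAction`, `normalize`, `load_blacklist`, and `handle_request`, the URL normalization rules, and the one-URL-per-row CSV layout are all assumptions made for the example (the real cumulative.csv format is defined by the blacklist provider).

```python
import csv
import io
from enum import Enum


class BlacklistAction(Enum):
    """Actions the text lists for a blacklist match (names are illustrative)."""
    DISCARD = "discard"
    REDIRECT = "redirect"
    TERMINATE = "terminate"


def normalize(url: str) -> str:
    """Hypothetical normalization: lower-case, drop scheme and trailing slash."""
    url = url.strip().lower()
    for scheme in ("http://", "https://"):
        if url.startswith(scheme):
            url = url[len(scheme):]
    return url.rstrip("/")


def load_blacklist(csv_text: str) -> frozenset:
    """Build a lookup set; assumes the URL is the first column of each row."""
    rows = csv.reader(io.StringIO(csv_text))
    return frozenset(normalize(row[0]) for row in rows if row and row[0].strip())


def handle_request(url: str, blacklist: frozenset, action: BlacklistAction) -> str:
    """Return 'pass' when there is no match, else the configured action."""
    if normalize(url) in blacklist:
        return action.value
    return "pass"


blacklist = load_blacklist("http://bad.example/page\nbad2.example/\n")
print(handle_request("http://bad.example/page/", blacklist, BlacklistAction.DISCARD))
print(handle_request("http://ok.example/", blacklist, BlacklistAction.DISCARD))
```

A set lookup keeps the per-request cost roughly constant regardless of blacklist size, which is presumably one reason the text file is converted into an optimized format before use.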
The system generates usage/event data that can be utilized as the basis for blacklist reporting. The offline reports consist of, at a minimum, a running total of the number of times a match was made against the blacklist without any information regarding the specifics of the request.
• Session Manager (SessMgr): A single SessMgr handles both the ECS charging and the URL Blacklisting that apply to common subscriber sessions.
The WEM is a server-based application enabling complete element management of the system. The UNIX-based server application works with the network elements within the system using the Common Object Request Broker Architecture (CORBA) standard.
The URL Blacklisting solution includes the CF-CDP consisting of a geographically-redundant server hosting the CDP application that may be the same as the External Storage Server in ESS.
In the URL Blacklisting solution, the CF-CDP serves as a repository and distribution node for the Blacklisting database and its updates (OPTCMBLDB). There are no incremental updates in the URL Blacklisting feature as a new Full database will be supplied by WEM daily.
The new blacklist is loaded only if it has been received properly. If the full Blacklist database is not found or is corrupted, or if the loading fails, traps are generated. Corresponding clear traps are generated when a valid Blacklist database becomes available and after a successful load.
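The load-or-trap behavior described above might be modeled as in the following sketch. It is an assumption-laden illustration only: the `BlacklistLoader` class, the `parse_db` toy parser, the event tuples that stand in for SNMP traps, and the choice to keep the previously loaded database on failure are all hypothetical and are not taken from the product.

```python
class BlacklistLoader:
    """Keeps the last good database and tracks one load-failure trap."""

    def __init__(self):
        self.active_db = None
        self.trap_active = False
        self.events = []  # stands in for traps; the event names are hypothetical

    def try_load(self, raw, parse):
        """Load `raw` via `parse`; on any failure keep the current database."""
        try:
            if raw is None:
                raise FileNotFoundError("full blacklist database not found")
            new_db = parse(raw)
        except (FileNotFoundError, ValueError) as exc:
            if not self.trap_active:
                self.trap_active = True
                self.events.append(("trap", str(exc)))
            return False
        self.active_db = new_db
        if self.trap_active:
            # A valid database is available again, so clear the earlier trap.
            self.trap_active = False
            self.events.append(("clear", "blacklist database loaded"))
        return True


def parse_db(raw: bytes) -> frozenset:
    """Toy parser: one URL per line; an empty result is treated as corrupt."""
    lines = raw.decode("utf-8").splitlines()
    urls = frozenset(line.strip() for line in lines if line.strip())
    if not urls:
        raise ValueError("blacklist database is corrupted or empty")
    return urls


loader = BlacklistLoader()
loader.try_load(None, parse_db)              # missing file: trap
loader.try_load(b"bad.example\n", parse_db)  # valid file: load and clear trap
print(loader.events)
```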
When the SessMgrs start for the first time or after recovery, if URL Blacklisting is set in any of the rulebases, the Blacklist database stored at SessCtrl is loaded onto the SessMgrs. This holds true for standby managers as well; that is, when standby managers come up, the Blacklist database is loaded onto them.
Whenever a SessMgr is killed, the standby manager, which already has the Blacklist database loaded, takes its place, and a new standby manager is created, which loads the Blacklist database as part of its first-time start.
If SessCtrl is killed, then while recovering it checks whether URL Blacklisting is set in any of the rulebases. If it is set, SessCtrl stores the Blacklist database onto itself and loads it onto all the SessMgrs as well.
The Category-based Content Filtering application is a fully integrated, subscriber-aware in-line service provisioned on ST-series Multimedia Core Platform running HA services. This application is transparently integrated within the ECS, and utilizes a distributed software architecture that scales with the number of active HA sessions on the system.
Content Filtering policy enforcement is the process of deciding if a subscriber should be able to receive some content. Typical options are to allow, block, or replace/redirect the content based on the rating of the content and the policy defined for that content. For the list of content categories, refer to the Category List appendix in the Content Filtering Services Administration Guide.
The Category-based Content Filtering solution enables operators to ensure a simplified end-to-end traffic flow with a simple network topology. In-line deployment of Content Filtering provides a more attractive solution in contrast to out-of-line solutions where the filtering and policy enforcement is provided at some offload point that is decoupled from the bearer-processing layer.
The out-of-line model forces a session to make multiple hops through a redundant array of equipment which has a negative impact on traffic latency and limits subscriber and network visibility. In addition, the out-of-line model requires all subscriber sessions to be steered to the adjunct Content Filtering platform for policy enforcement regardless of whether this additional processing is needed. This leads to increased bandwidth provisioning requirements on gateway routers.
To facilitate network simplicity, it makes sense to leverage the benefits of deep packet inspection at a single policy enforcement point that is tied to the bearer processing layer. The advantages of this approach include the following:
• Reduced processing latency: In-line service processing eliminates unnecessary hand-offs and forwarding to external network elements.
• Simplified policy provisioning: Enables all policies, such as Content Filtering, ECS, and QoS, to be retrieved from the same AAA/Policy Manager signaling interface, thus reducing the total volume of control transactions and the associated delay.
• Simplified provisioning and complete service integration: Provisioning of separate resources like PACs/PSCs for processing of subscriber data sessions and discrete services is eliminated. The same PAC/PSC CPU can contain active Session Manager tasks for running Content Filtering and ECS charging.
• Service control: Precise control over the interaction and service order handling of bearer flows with required applications like Content Filtering, ECS, Subscriber-aware Stateful Firewall, and the integrated Policy Charging and Rules Function (PCRF) for Service Based Bearer Control.
Apart from the advantages described previously, Category-based Content Filtering service reduces the requirement of over-provisioning of capacity at neighboring gateway routers. It also eliminates requirements of external Server Load Balancers and enhances the accuracy in subscriber charging records.
With Static Category-based Content Filtering, the filtering is only as good as the collection of URLs in the database. Even the largest URL database covers only a fraction of the Surface Web and virtually none of the Deep Web. It is quite impossible to find, review, and categorize enough of the available Web sites to keep the database current.
Also, many mobile sites are classified as dynamic sites. A dynamic site may return either acceptable or inappropriate content from the same URL. Examples include search engines, news portals, and auction sites, which return variable results depending upon subscriber requests.
When the Content Filtering subsystem receives a request for dynamic content it becomes necessary to categorize pages in real-time to determine how to classify content the provider is delivering at that moment. The “Static Rating” solution that relies exclusively on previously categorized rating for sites may fail to categorize dynamic sites appropriately.
Dynamic Content Filtering enables on-the-fly content analysis of Web traffic using different content analysis techniques. When a Web page is received, it is analyzed and then categorized according to the content found in the page. Whether a Web site has existed for five months or for five minutes does not matter since determination of the category to which the Web page belongs is made just at the time of request. Therefore, dynamic filters have no problem keeping up with the growth and changing content of the Internet. A combination of static filtering and dynamic inspection provides real accuracy and scalability as the Web weaves an increasingly sophisticated network of sites.
In Static-and-Dynamic Content Filtering, every URL first undergoes static rating. If the URL cannot be rated by the static database, or if the URL’s static rating is categorized as DYNAM, then it goes for dynamic rating. After the content has been analyzed, as with static content filtering, dynamic rating actions include acceptance, blocking, redirection, and/or replacement of content.
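The static-then-dynamic sequence can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the dictionary stands in for the static rating database, `toy_dynamic_rater` is a placeholder keyword check rather than the real-time analyzer, and the function and category names other than DYNAM are invented for the example.

```python
from typing import Callable, Optional, Tuple

DYNAMIC = "DYNAM"  # category the text uses for URLs that need dynamic rating


def rate_url(url: str,
             static_db: dict,
             fetch_content: Callable[[str], str],
             dynamic_rater: Callable[[str], str]) -> Tuple[str, str]:
    """Return (category, source), where source is 'static' or 'dynamic'."""
    category: Optional[str] = static_db.get(url)
    if category is not None and category != DYNAMIC:
        return category, "static"
    # Unknown to the static database, or rated DYNAM: analyze the content.
    return dynamic_rater(fetch_content(url)), "dynamic"


def toy_dynamic_rater(content: str) -> str:
    """Stand-in for the real-time analyzer: a single keyword check."""
    return "GAMBLING" if "casino" in content.lower() else "CLEAN"


static_db = {"news.example": "NEWS", "search.example": DYNAMIC}
pages = {"search.example": "Best casino offers", "new.example": "Hello world"}

for url in ("news.example", "search.example", "new.example"):
    print(url, rate_url(url, static_db, pages.get, toy_dynamic_rater))
```

Note that content is fetched only on the dynamic path, so URLs with a usable static rating never incur the cost of content analysis.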
Static-and-Dynamic Content Filtering must be enabled at the global and rulebase levels. Before enabling static-and-dynamic rating in the rulebase, it must be enabled at the global level as the resources required for dynamic rating are allocated at the global level. When enabled in a rulebase, it is applied for subscribers using that rulebase.
The Category-based Content Filtering solution is provided as an integrated subsystem within the Enhanced Charging Service (ECS). Although it is not necessary to provision content-based charging in conjunction with content filtering, it is highly desirable as it enables a single point of deep-packet inspection for both services. It also enables a single policy decision and enforcement point for both services, thereby streamlining the required number of signaling interactions with external AAA/Policy Manager servers. Utilizing both services also increases billing accuracy of charging records by ensuring that mobile subscribers are charged only for visited site content.
The Category-based Content Filtering solution uses Content Filtering Policy to analyze the content requested by subscribers. Content Filtering Policy provides a decision point for analyzed content on the basis of its category and priority.
The Category-based Content Filtering solution also utilizes ACS rulebases in order to determine the correct policy decision and enforcement action such as accept, block, redirect, or replace. Rulebase names are retrieved during initial authentication from the AAA/Policy Manager. Some possible examples of rulebase names include Consumer, Enterprise, Child, Teen, Adult, and Sport. Rulebase names are used by the ACS subsystem to instantiate the particular rule definition that applies for a particular session. Rulebases work in conjunction with a content filtering policy, and only one content filtering policy can be associated with a rulebase.
The ACS subsystem includes L3–L7 deep packet inspection capabilities. It correlates all L3 packets with higher layer criteria such as URL detection within an HTTP header. It also provides stateful packet inspection for complex protocols like FTP, RTSP, and SIP that dynamically open ports for the data path.
• Session Manager (SessMgr): A single SessMgr handles both the ECS charging and the Content Filtering that apply to common subscriber sessions.
Figure 1-2 shows these components and their location in the high-level Category-based Content Filtering architecture.
This is an internal categorization database (periodically synchronized with an external server) that provides ratings for publicly accessible traditional and mobile Web sites. When the SessMgr passes a URL/URI to the internal list server, the list server returns a list of matching category ratings.
The list server is used to determine whether a Web site has already been classified. When the list server passes back a category rating to the filtering application, the rating is compared against the Category Policy ID applied for the subscriber to determine the appropriate action like accept, block, redirect, or replace. If the list server returns a clean rating, there is no need to perform a real-time analysis of any content delivered by the site.
When a blocked or rejected content rating is returned, the SessMgr can insert data such as a redirect server address into the bearer data stream. If no rating is returned this means the site is capable of returning either clean or unacceptable content. In this case, the Content Filtering application uses the real-time dynamic analysis engine to examine additional content served by the site.
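The comparison of returned ratings against the subscriber's policy can be illustrated with a small sketch. It assumes a policy is a mapping from category to a priority and an action, that a lower number means a higher priority, and that an unmatched rating falls back to a default action; these conventions, the `PolicyEntry` and `decide` names, and the sample categories are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyEntry:
    priority: int  # assumption: a lower number means a higher priority
    action: str    # "accept", "block", "redirect", or "replace"


def decide(ratings, policy, default_action="accept"):
    """Pick the action of the highest-priority category found in the policy."""
    matched = [policy[c] for c in ratings if c in policy]
    if not matched:
        return default_action
    return min(matched, key=lambda entry: entry.priority).action


teen_policy = {
    "ADULT": PolicyEntry(priority=1, action="block"),
    "GAMBLING": PolicyEntry(priority=2, action="redirect"),
    "NEWS": PolicyEntry(priority=10, action="accept"),
}

print(decide(["NEWS", "GAMBLING"], teen_policy))
print(decide(["SPORT"], teen_policy))
```

Because the list server may return several matching categories for one URL, resolving them by priority gives a single, deterministic action per request.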
Each SRDB contains a replication object consisting of hash tables that map known Web sites and their subdirectories to their respective category ratings. The SessCtrl reads the index of SRDB tables with a data structure that associates keys with URL rating values and loads it onto the SRDB managers.
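A hash table that maps sites and their subdirectories to ratings could be queried as in the sketch below. The key format ("host/path"), the longest-match-first search order, and the function names are assumptions chosen for illustration; the actual SRDB table layout is not described here.

```python
from urllib.parse import urlsplit


def candidate_keys(url: str):
    """Yield 'host/path' keys from most specific to least specific."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.netloc.lower()
    segments = [s for s in parts.path.split("/") if s]
    for end in range(len(segments), -1, -1):
        yield "/".join([host] + segments[:end])


def lookup(url: str, table: dict):
    """Return the rating of the longest matching site or subdirectory."""
    for key in candidate_keys(url):
        if key in table:
            return table[key]
    return None  # not rated by the static table


table = {"example.com": "NEWS", "example.com/games": "GAMBLING"}

print(lookup("http://example.com/games/poker", table))
print(lookup("example.com/sports/today", table))
print(lookup("other.example", table))
```

Each probe is a constant-time hash lookup, so the cost of rating a URL grows with the depth of its path rather than with the size of the database.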
To boost performance and provide high availability, SRDB Manager provides functionality to load the Optimized Content Rating Master Database (OPTCMDB) volumes from its peer SRDB task. If the peer SRDB task is not in loading state then the OPTCMDB loading is done through SessCtrl to the recovered SRDB task.
First one Dynamic SRDB task is created with two Standby SRDBs on the same CPU, then eight Static SRDBs are created, and then more Dynamic SRDB tasks are created if memory is available. A Dynamic SRDB is always created with two Standby SRDBs with it on the same CPU.
The Dynamic Rater Package, which contains the model files (used for language detection and category recognition) and feature counters (used to decide whether or not to evaluate the Web page against the respective model file), is loaded on the SRDB Managers. The rater package is loaded only on the active SRDB and not on the standbys.
The Rater package containing the model files (used for language detection and category recognition) and feature counters (used to decide whether or not to evaluate the Web page against the respective model file) is stored at pcmcia1/cf. After loading the static database, ACS will read the Rater package and load it onto the Dynamic SRDB Managers.
The rater package loaded onto SRDBs can be upgraded using an upgrade CLI command, which looks for the upgrade file in the form “rater_f.pkg” at a specific location and, if found, loads the new package onto the SRDBs. On successful loading, the “rater_f.pkg” is replaced with “rater.pkg” and versioned. In case of loading or upgrade failures, appropriate traps are generated.
The real-time analyzer requires a model file that defines the features which are necessary to classify a Web page as belonging to a specific category and language. A model file per category is created by analyzing the traits of thousands of pages of that category and thousands of pages that do not belong to that category. For some categories, a feature counter file is used to decide whether or not to evaluate the Web page against the respective model file.
When the URL Blacklisting solution is the only content filtering enabled on a system, no SRDB tasks are spawned at startup. SRDB tasks are spawned only when Category-based Content Filtering is enabled, either in isolation or together with URL Blacklisting.
This is a third-party content rating solution for exporting content filtering rules database information to the Category-based Content Filtering system. In addition, while exporting database updates, it collects reports of URLs processed by ECS and Content Filtering services that are reported as unknown in the deployed static rating database. This server analyzes these URLs and provides their ratings in future updates to the static rating database.
The Category-based Content Filtering solution provides a Master Content Rating Database Server to convert the VFMDB to Starent Networks’ Format Master Database (SFMDB). It handles both full and incremental updates and processes them on a configured schedule.
This server is also responsible for distribution of Starent Networks’ format data files to WEM servers in the Starent Networks’ customer support infrastructure at a configured interval.
The L-ESS and R-ESS are ECS applications running on redundant highly available servers that collect and process EDRs and UDRs from which billing events and reports are generated. Either the system pushes the EDR/UDR files to the L-ESS, or the L-ESS fetches them from the system and processes them into formats suitable for billing mediation servers, Report Engine at CDP, and the R-ESS. The R-ESS consolidates the processed EDR/UDR files into a database for report generation through UDR Report Generator or EDR Report Generator at CDP.
When Content Filtering is deployed in conjunction with ECS, the operator has an option of collocating the Central Decision Point (CF-CDP) with an ESS thereby eliminating the requirement of the CF-CDP to fetch a copy of the CF-EDRs from each system running ECS and Content Filtering service. The database generated on an ESS by processing EDR/UDR records is a superset of the database required by a CF-CDP.
The function of the RADIUS Server/Policy Manager in the Content Filtering solution is to provide per-subscriber Content Filtering provisioning information when a subscriber’s session is established. It can also issue a Change-of-Authorization (CoA) to update an in-progress session to modify the Content Filtering policy for a subscriber.
The Content Filtering solution provides a GUI that gives information to support staff in the operator’s customer-care support center. This interface allows support personnel to quickly address subscribers’ questions or concerns about policies on their account.
The WEM is a server-based application providing complete element management of the system. The UNIX-based server application works with the network elements within the system using the Common Object Request Broker Architecture (CORBA) standard.
The Category-based Content Filtering solution also includes a Central Decision Point (CF-CDP) with Report Engine that can be integrated in an R-ESS of ECS Storage System. The CF-CDP consists of a geographically redundant server hosting the CDP application that may be the same as Remote-External Storage Server in ESS.
The primary activity of the CF-CDP is to process CF-EDRs into a database that supports report generation by the WEM, and query processing by the Customer-Care Management application.
In addition, the CF-CDP serves as a repository and distribution node for the optimized content rating master database and incremental updates (OPTCMDB/OPTCMDB-INC). CF-System uses the OPTCMDB to perform the actual Content Rating Function on subscriber traffic. The OPTCMDB database is also used by an instance of the static and dynamic rating engines running on the CF-CDP to provide a functional rating service that is leveraged by the Customer-Care application to display the static rating of a URL and/or corresponding dynamic rating for the URL’s content.
The CF-CDP server provides updated configuration files to the CF-System with the latest revisions to the static categorization database. The Content Filtering application also provides a mechanism to properly distinguish between release versions. The configuration updates are securely transmitted via SFTP over SSH via the out-of-band management network to a SPIO interface and the local management context of the chassis.
Updates to the CF-System occur on request or at a configured interval. To further reduce the volume of traffic over the management network, instead of retransmitting the entire SRDB at each update, it is also possible to send small incremental differential files that include only the URLs that were added since the previous update.
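The idea of applying a differential file instead of a full database can be sketched as follows. The dictionary layout of the incremental file (`base`, `version`, `added`) and the rule that a version mismatch requires a full update are hypothetical; they only illustrate why an increment must be tied to the database release it was computed against.

```python
def apply_incremental(db: dict, db_version: int, inc: dict) -> int:
    """Merge an incremental file into `db` in place; return the new version.

    `inc` is a hypothetical structure:
    {"base": int, "version": int, "added": {url: category}}.
    """
    if inc["base"] != db_version:
        raise ValueError("incremental file does not match the loaded "
                         "database version; a full update is needed")
    db.update(inc["added"])  # only URLs added since the previous update
    return inc["version"]


db = {"news.example": "NEWS"}
version = 1
update = {"base": 1, "version": 2, "added": {"games.example": "GAMBLING"}}

version = apply_incremental(db, version, update)
print(version, sorted(db))
```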
The Report Generator utility in CF-CDP is a script-based tool responsible for report generation and CF-EDR parsing. A script-based utility generates reports in XML format for content filtering subscribers, and the Report Engine server takes care of EDR parsing. The script can be used with a cron job for periodic report generation in the background. Reports can also be generated from the WEM GUI.
• Overall Summary Report: This is a short summary report of all the activities performed during a period of time. This report includes the following schema for the subscriber activity summary report:
• Subscriber Detailed Report: This is a detailed report in which operators get detailed information about subscriber activities on a Content Filtering system. This report includes all information about a subscriber’s request with the following schema for each URL requested:
• URL Summary Report: This is a high-level report which provides the list of URLs and the number of times a URL was visited, with the following additional schema along with the schema of the Summary Report:
The Content Filtering Subsystem integrated into the ACS subsystem consists of two primary elements: an onboard static categorization database and a dynamic rating engine. The filtering service uses the Deep Packet Inspection (DPI) capabilities of the ACS subsystem to classify and partition application or protocol specific flows into virtual sessions.
Content analyzers are used to identify various types of flows such as HTTP, MMS/WAP, and POP3 E-mail. A typical HTTP request for a Web page, for example, invokes TCP and HTTP traffic analyzers. Any HTTP field including URLs or URIs can be identified. When a subscriber session is bound by CSS to an ECS running content filtering service, the URL/URI is extracted and compared against the static categorization database.
• Filter ID or Access Control List Name: Applied to the subscriber session. It typically contains the name of the Content Service Steering (CSS) ACL. The CSS ACL establishes the particular service treatments, such as Content Filtering, ECS, Traffic Performance Optimization, Stateful Firewall, and VPN, to apply to a subscriber session and the service order sequence to use in the inbound or outbound directions. Real-time or delay-sensitive flows are transmitted directly to the Internet with no further processing required. In this case, no CSS ACL or Filter ID is included in the Access Response.
• SN-CF-Category-Policy: Applied to the subscriber content flow. The policy ID included in this attribute overrides the policy identifier applied to the subscriber through rulebase or APN/Subscriber configuration. This content filtering policy determines the action to be taken on a content request from the subscriber on the basis of its category. At any time, only one content filtering policy can be associated with a rulebase.
• SN1-Rulebase Name: This custom attribute contains information such as consumer, business name, or child/adult/teen. The rulebase name identifies the particular rule definitions to apply. Rulebase definitions are used in ECS as the basis for deriving charging actions such as prepaid/postpaid volume/duration/destination billing and charging data files (EDRs/UDRs). Rulebase definitions are also used in content filtering to determine whether a type of user class such as teenagers should be permitted to receive requested content belonging to a particular type of category such as adult entertainment, gambling, or hate sites. Rulebase definitions are generated in the Active Charging Configuration Mode and can be applied to individual subscribers, to domains, or on a per-context basis.
Within the system, if the bearer flow is treated by Content Filtering or other in-line services, the SessMgr feeds it to the Content Service Steering (CSS) API. If Content Filtering is the first service touch point, TCP and HTTP traffic analyzers within a given SessMgr utilize deep-packet inspection to extract the requested URL.
When only Static Content Filtering is enabled, the URL is first looked up in the cache maintained at the SessMgr for static URL requests. If there is a hit, the category is returned. If it is a miss, a URL look-up is performed by an onboard SRDB for static rating.
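The cache-then-SRDB look-up order can be modeled as below. This is a sketch under assumptions: the cache size limit and least-recently-used eviction are not stated in the text, and `StaticRatingCache` and `fake_srdb_lookup` are hypothetical names, with the latter standing in for the request to an onboard SRDB task.

```python
from collections import OrderedDict


class StaticRatingCache:
    """Per-SessMgr cache in front of a slower SRDB look-up function."""

    def __init__(self, srdb_lookup, max_entries=1000):
        self._srdb_lookup = srdb_lookup
        self._max_entries = max_entries
        self._entries = OrderedDict()

    def rate(self, url):
        if url in self._entries:
            # Cache hit: return the stored category, mark as recently used.
            self._entries.move_to_end(url)
            return self._entries[url]
        # Cache miss: ask the SRDB, then remember the answer.
        category = self._srdb_lookup(url)
        self._entries[url] = category
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return category


srdb_calls = []


def fake_srdb_lookup(url):
    """Records each call so that cache hits and misses are visible."""
    srdb_calls.append(url)
    return {"news.example": "NEWS"}.get(url, "UNKNOWN")


cache = StaticRatingCache(fake_srdb_lookup, max_entries=2)
cache.rate("news.example")  # miss, goes to the SRDB
cache.rate("news.example")  # hit, served from the cache
print(srdb_calls)
```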
Handling for concatenated and pipelined responses is the same as in Static Content Filtering. The action taken is based on the highest priority category among the pipelined responses.
Content Filtering EDRs are the same as for static rating. However, if static rating fails and the request goes for dynamic rating, then Content Filtering EDRs are generated only after dynamic rating has been completed, and not at the point when static rating failed.
Both URL Blacklisting and Category-based Content Filtering can be concurrently enabled in a system. The following describes how URL blacklisting and content filtering are performed on HTTP/WAP traffic when concurrently enabled on a system:
When an HTTP/WAP request comes for ACS processing, a check is made to see if the URL Blacklisting feature is enabled. If enabled, the URL is extracted from the incoming request and is matched with the local Blacklist database.
ECS supports the streamlined ICAP interface to leverage Deep Packet Inspection, enabling external application servers, such as an external Active Content Filtering (ACF) platform, to provide their services without performing DPI themselves and without being inserted in the data flow.
If a subscriber initiates a WAP (WAP1.x or WAP2.0) or Web session, the subsequent GET/POST request is detected by the DPI function. The URL of the GET/POST request is extracted and passed, along with subscriber identification information and the subscriber request, in an ICAP message to the application server.
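As a rough illustration of how an extracted request might be wrapped in an ICAP message, the sketch below builds a REQMOD request in the general shape defined by RFC 3507. It handles only a body-less HTTP request header section (as in a GET), and the `X-Subscriber-Id` header, the service path, and the host names are hypothetical; the headers actually used by the system to carry subscriber identification are not specified here.

```python
def build_icap_reqmod(icap_host: str, service: str,
                      http_request_head: bytes, subscriber_id: str) -> bytes:
    """Wrap an HTTP request header section (no body) in an ICAP REQMOD."""
    if not http_request_head.endswith(b"\r\n\r\n"):
        raise ValueError("HTTP header section must end with a blank line")
    icap_head = (
        f"REQMOD icap://{icap_host}/{service} ICAP/1.0\r\n"
        f"Host: {icap_host}\r\n"
        # Hypothetical header name for the subscriber identification.
        f"X-Subscriber-Id: {subscriber_id}\r\n"
        # The encapsulated request header starts at offset 0; null-body
        # gives the offset where it ends and signals that no body follows.
        f"Encapsulated: req-hdr=0, null-body={len(http_request_head)}\r\n"
        "\r\n"
    ).encode("ascii")
    return icap_head + http_request_head


http_head = b"GET /news HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
message = build_icap_reqmod("icap.example", "content-filter",
                            http_head, "subscriber-001")
print(message.decode("ascii"))
```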
In the case of Category-based Content Filtering solution, the application server checks the URL on the basis of its category and other classifications like type, access level and content category and decides if the request should be authorized, blocked, or redirected by answering to the GET/POST with:
Depending on the response received, the system with ECS will either pass the request unmodified, or discard the message, and respond to the subscriber with the appropriate redirection or block message.
Content Charging is performed by the ACS only after the request has been controlled by the application server. This guarantees the appropriate interworking between the external application and content-based billing. In particular, this guarantees that charging will be applied to the appropriate request in case of redirection, and that potential charging-based redirections (for example, Advice of Charge or a Top Up page) will not interfere with the decisions taken by the application server.
For information on configuring the ICAP interface support for external ACF servers, refer to the ICAP Interface Support chapter of the System Enhanced Feature Configuration Guide.
To store generated xDR files, on the ST40 chassis, the system allocates 512 MB of memory on the PSC RAM, and on the ST16 chassis, 256 MB on the PAC RAM. The generated xDRs are stored in CSV format in the /records directory on the PAC/PSC RAM. These generated xDRs can be used for billing as well as for generation of reports to analyze network usage and subscriber trends.
As this temporary storage space (size configurable) reaches its limit, the system deletes older xDRs to make room for new xDRs. Setting gzip file compression extends the storage capacity by approximately 10:1.
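The oldest-first deletion and the effect of compression can be illustrated with a short sketch. The `XdrStore` class and the file names are hypothetical, and the 10:1 figure is the approximate ratio quoted above rather than a guaranteed value.

```python
from collections import deque


class XdrStore:
    """Bounded store that deletes the oldest files to make room for new ones."""

    def __init__(self, limit_bytes: int):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self._files = deque()  # (name, size) pairs, oldest first

    def add(self, name: str, size: int) -> list:
        """Store a file; return the names of files deleted to make room."""
        if size > self.limit_bytes:
            raise ValueError("file is larger than the whole store")
        deleted = []
        while self.used_bytes + size > self.limit_bytes:
            old_name, old_size = self._files.popleft()
            self.used_bytes -= old_size
            deleted.append(old_name)
        self._files.append((name, size))
        self.used_bytes += size
        return deleted


def effective_capacity_mb(raw_mb: int, ratio: int = 10) -> int:
    """Approximate uncompressed data that fits when files are gzip-compressed."""
    return raw_mb * ratio


store = XdrStore(limit_bytes=100)
store.add("edr_0001.csv", 60)
print(store.add("edr_0002.csv", 50))  # the oldest file is deleted to fit this one
print(effective_capacity_mb(512))     # ST40 PSC allocation at roughly 10:1
```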
Because of the volatile nature of the memory, xDRs can be lost due to overwriting, deletion, or unforeseen events such as power or network failure or unplanned chassis switchover. To avoid losing charging and network analysis information, configure the CDR subsystem in conjunction with the External Storage System (ESS) to offload the xDRs for storage and analysis.
The ST-series Multimedia Core Platform requires the following additional hardware and memory to handle the Content Rating Master Databases; for example, for Category-based Content Filtering OPTCMDB. The memory required may vary with the size of rating databases used for content rating service.